The Influence of Minimum Edit Distance on Reference Resolution
Abstract
We report on experiments in reference resolution using a decision tree approach. We started with a standard feature set used in previous work, which led to moderate results. A closer examination of how the features performed for different forms of anaphoric expressions showed good results for pronouns, moderate results for proper names, and poor results for definite noun phrases. We then included a cheap, language- and domain-independent feature based on the minimum edit distance between strings. This feature yielded a significant improvement on both the data set of definite noun phrases and the data set of proper names. When applied to the whole data set, the feature produced a smaller but still significant improvement.
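As an illustration of the kind of feature the abstract describes, the following is a minimal sketch of a minimum edit distance computation between an anaphor string and a candidate antecedent string. It assumes unit costs for insertion, deletion, and substitution (plain Levenshtein distance), lowercases both strings, and normalizes by the length of the longer string. The function names `edit_distance` and `med_feature`, the lowercasing, and the normalization are hypothetical choices made here for illustration; the abstract does not specify how the authors computed or scaled their feature.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit-cost insert, delete, and substitute."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            substitution_cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,                       # delete ca
                current[j - 1] + 1,                    # insert cb
                previous[j - 1] + substitution_cost,   # substitute or match
            ))
        previous = current
    return previous[-1]


def med_feature(anaphor: str, antecedent: str) -> float:
    """Hypothetical normalized feature in [0, 1]; 0.0 means identical strings.

    Assumption: case is ignored and the distance is divided by the length
    of the longer string. The source does not state the authors' scaling.
    """
    longest = max(len(anaphor), len(antecedent))
    if longest == 0:
        return 0.0
    return edit_distance(anaphor.lower(), antecedent.lower()) / longest
```

A value like this could be passed to a decision tree learner alongside the other features for each anaphor and antecedent pair. It is cheap because it needs only the two surface strings, with no language-specific or domain-specific resources.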
Similar resources
Shape recognition using fuzzy string-matching technique
Object recognition is a very important task in industrial applications. Attributed string matching is a well-known technique for pattern matching. The present paper proposes a fuzzy string-matching approach for two-dimensional object recognition. The fuzzy numbers are used to represent the edit costs. Therefore, the edit distances are also presented as fuzzy numbers. The attributed string-match...
Tolerant BLEU: a Submission to the WMT14 Metrics Task
This paper describes a machine translation metric submitted to the WMT14 Metrics Task. It is a simple modification of the standard BLEU metric using a monolingual alignment of reference and test sentences. The alignment is computed as a minimum weighted maximum bipartite matching of the translated and the reference sentence words with respect to the relative edit distance of the word prefixes a...
The Generalized Approximate Regularities in Strings
We concentrate on the generalized string regularities and study the minimum approximate λ-cover problem and the minimum approximate λ-seed problem of a string. Given a string x of length n and an integer λ, the minimum approximate λ-cover (resp. seed) problem is to find a set of λ substrings each of equal length that covers x (resp. a superstring of x) with the minimum error, under a variety of...
A Quadratic Programming Approach to the Graph Edit Distance Problem
In this paper we propose a quadratic programming approach to computing the edit distance of graphs. Whereas the standard edit distance is defined with respect to a minimum-cost edit path between graphs, we introduce the notion of fuzzy edit paths between graphs and provide a quadratic programming formulation for the minimization of fuzzy edit costs. Experiments on real-world graph data demonstr...
The Undecidability of the Unrestricted Modified Edit Distance
We define the unrestricted modified edit distance based on the modified edit distance defined by Galil and Giancarlo (1989), where the costs of substring deletions and insertions are context-sensitive and the costs of character substitutions are context-free. The modified edit distance is the minimum cost of converting a string X to a string Y where the sequence of edit operations has the property ...